Abstract. Given a subset $K$ of the unit Euclidean sphere, we estimate the minimal number $m = m(K)$ of hyperplanes that generate a uniform tessellation of $K$, in the sense that the fraction of the hyperplanes separating any pair $x, y \in K$ is nearly proportional to the Euclidean distance between $x$ and $y$. Random hyperplanes prove to be almost ideal for this problem; they achieve the almost optimal bound $m = O(w(K)^2)$, where $w(K)$ is the Gaussian mean width of $K$. Using the map that sends $x \in K$ to the sign vector with respect to the hyperplanes, we conclude that every bounded subset $K$ of $\mathbb{R}^n$ embeds into the Hamming cube $\{-1,1\}^m$ with a small distortion in the Gromov-Hausdorff metric. Since for many sets $K$ one has $m = m(K) \ll n$, this yields a new discrete mechanism of dimension reduction for sets in Euclidean spaces.

1. Introduction 

Consider a bounded subset $K$ of $\mathbb{R}^n$. We would like to find an arrangement of $m$ affine hyperplanes in $\mathbb{R}^n$ that cut through $K$ as evenly as possible; see Figure 1 for an illustration. The intuitive notion of an "even cut" can be expressed more formally in the following way: the fraction of the hyperplanes separating any pair $x, y \in K$ should be proportional (up to a small additive error) to the Euclidean distance between $x$ and $y$. What is the smallest possible number $m = m(K)$ of hyperplanes with this property? Besides having a natural theoretical appeal, this question is directly motivated by a certain problem of information theory which we will describe later.

Figure 1. A hyperplane tessellation of a set in the plane

In the beginning it will be most convenient to work with subsets $K$ of the unit Euclidean sphere $S^{n-1}$, but we will lift this restriction later. Let $d(x,y)$ denote the normalized geodesic distance on $S^{n-1}$, so the distance between the opposite points on the sphere equals 1. A (linear) hyperplane in $\mathbb{R}^n$ can be expressed as $a^\perp$ for some $a \in \mathbb{R}^n$. We say that points $x, y \in \mathbb{R}^n$ are separated by the hyperplane if $\mathrm{sign}\langle a, x\rangle \ne \mathrm{sign}\langle a, y\rangle$.

Definition 1.1 (Uniform tessellation). Consider a subset $K \subseteq S^{n-1}$ and an arrangement of $m$ hyperplanes in $\mathbb{R}^n$. Let $d_A(x,y)$ denote the fraction of the hyperplanes that separate points $x$ and $y$ in $\mathbb{R}^n$. Given $\delta > 0$, we say that the hyperplanes provide a $\delta$-uniform tessellation of $K$ if

(1.1) $|d_A(x,y) - d(x,y)| \le \delta, \quad x, y \in K.$
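The two distances compared in (1.1) can be computed directly for a small arrangement. The following is a minimal sketch, not part of the paper's argument: the helper names (`separated_fraction`, `geodesic_distance`), the convention that the sign of zero is $+1$, and the choice of eight equally spaced lines in the plane are all assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def sign(t):
    # Assumed convention for this sketch: sign(0) is taken to be +1.
    return 1 if t >= 0 else -1

def separated_fraction(normals, x, y):
    """d_A(x, y): fraction of the hyperplanes a^perp that separate x and y."""
    count = sum(1 for a in normals if sign(dot(a, x)) != sign(dot(a, y)))
    return count / len(normals)

def geodesic_distance(x, y):
    """Normalized geodesic distance on the unit sphere (antipodes at distance 1)."""
    c = max(-1.0, min(1.0, dot(x, y)))
    return math.acos(c) / math.pi

# Eight lines through the origin in the plane, with equally spaced normals.
m = 8
normals = [(math.cos(math.pi * (k + 0.5) / m), math.sin(math.pi * (k + 0.5) / m))
           for k in range(m)]

x = (1.0, 0.0)
y = (math.cos(math.pi / 4), math.sin(math.pi / 4))

d_A = separated_fraction(normals, x, y)   # hyperplane-based distance
d = geodesic_distance(x, y)               # normalized angle between x and y
```

For this particular pair, two of the eight lines pass between $x$ and $y$, so both distances equal $1/4$ and the left-hand side of (1.1) vanishes.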

The main result of this paper is a bound on the minimal number $m = m(K, \delta)$ of hyperplanes that provide a uniform tessellation of a set $K$. It turns out that for a fixed accuracy $\delta$, an almost optimal estimate on $m$ depends only on one global parameter of $K$, namely the mean width. Recall that the Gaussian mean width of $K$ is defined as

(1.2) $w(K) = \mathbb{E} \sup_{x \in K} |\langle g, x\rangle|,$

where $g \sim \mathcal{N}(0, I_n)$ is a standard Gaussian random vector in $\mathbb{R}^n$.
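For a finite set, the expectation in (1.2) can be approximated by simulation. The sketch below is an illustration only: the function name `mean_width`, the number of trials, and the seed are assumptions, and the result is a Monte Carlo estimate rather than an exact value.

```python
import math
import random

def mean_width(points, trials=20000, seed=0):
    """Monte Carlo estimate of w(K) = E sup_{x in K} |<g, x>| for a finite K."""
    rng = random.Random(seed)
    n = len(points[0])
    total = 0.0
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += max(abs(sum(gi * xi for gi, xi in zip(g, x))) for x in points)
    return total / trials

# A single unit vector: w(K) = E|g_1| = sqrt(2/pi).
w_point = mean_width([(1.0,)])

# The coordinate basis of R^20: w(K) = E max_i |g_i|, of order sqrt(log n).
n = 20
basis = [tuple(1.0 if j == i else 0.0 for j in range(n)) for i in range(n)]
w_basis = mean_width(basis, trials=2000)
```

The second example shows a set that spans all of $\mathbb{R}^{20}$ but whose mean width is far below $\sqrt{n}$.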

Theorem 1.2 (Random uniform tessellations). Consider a subset $K \subseteq S^{n-1}$ and let $\delta > 0$. Let

$m \ge C\delta^{-6} w(K)^2$

and consider an arrangement of $m$ independent random hyperplanes in $\mathbb{R}^n$ uniformly distributed according to the Haar measure. Then with probability at least $1 - 2\exp(-c\delta^2 m)$, these hyperplanes provide a $\delta$-uniform tessellation of $K$. Here and later $C$, $c$ denote positive absolute constants.

Remark 1.3 (Tessellations in stochastic geometry). By the rotation invariance of the Haar measure, it easily follows that $\mathbb{E}\, d_A(x,y) = d(x,y)$ for each pair $x, y \in \mathbb{R}^n$. Theorem 1.2 states that with high probability, $d_A(x,y)$ almost matches its expected value uniformly over all $x, y \in K$. This observation highlights the principal difference between the problems studied in this paper and the classical problems on random hyperplane tessellations studied in stochastic geometry. The classical problems concern the shape of a specific cell (usually the one containing the origin) or certain statistics of cells (e.g. "how many cells have volume greater than a fixed number?"). In contrast to this, the concept of uniform tessellation we propose in this paper concerns all cells simultaneously; see Section 1.5 for a vivid illustration.
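The identity for the expectation of the separated fraction can be spelled out in one line. The following worked computation is a standard fact added for the reader's convenience; the symbol $\theta$ for the angle between $x$ and $y$ is notation introduced here, not in the original text.

```latex
% A single random hyperplane a^\perp with a rotation-invariant normal a
% separates x and y exactly when it cuts the arc between them.
% Let \theta \in [0,\pi] be the angle between x and y. Then
\[
  \mathbb{P}\bigl\{ \operatorname{sign}\langle a, x\rangle
      \ne \operatorname{sign}\langle a, y\rangle \bigr\}
  = \frac{\theta}{\pi} = d(x,y),
\]
% and, averaging over the m hyperplanes with normals a_1,\dots,a_m,
\[
  \mathbb{E}\, d_A(x,y)
  = \frac{1}{m} \sum_{i=1}^{m}
      \mathbb{P}\bigl\{ \operatorname{sign}\langle a_i, x\rangle
      \ne \operatorname{sign}\langle a_i, y\rangle \bigr\}
  = d(x,y).
\]
```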



1.1. Embeddings into the Hamming cube. Theorem 1.2 has an equivalent formulation in the context of metric embeddings. It yields that every subset $K \subseteq S^{n-1}$ can be almost isometrically embedded into the Hamming cube $\{-1,1\}^m$ with $m = O(w(K)^2)$.

To explain this statement, let us recall a few standard notions. An $\varepsilon$-isometry (or almost isometry) between metric spaces $(X, d_X)$ and $(Y, d_Y)$ is a map $f : X \to Y$ which satisfies

$|d_Y(f(x), f(x')) - d_X(x, x')| \le \varepsilon, \quad x, x' \in X,$

and such that for every $y \in Y$ one can find $x \in X$ satisfying $d_Y(y, f(x)) \le \varepsilon$. A map $f : X \to Y$ is an $\varepsilon$-isometric embedding of $X$ into $Y$ if the map $f : X \to f(X)$ is an $\varepsilon$-isometry between $(X, d_X)$ and the subspace $(f(X), d_Y)$. It is not hard to show that $X$ can be $2\varepsilon$-isometrically embedded into $Y$ (by means of a suitable map $f$) if $X$ has Gromov-Hausdorff distance at most $\varepsilon$ from some subset of $Y$. Conversely, if there is an $\varepsilon$-isometry between $X$ and $f(X)$ then the Gromov-Hausdorff distance between $X$ and $f(X)$ is bounded by $\varepsilon$.

Finally, recall that the Hamming cube is the set $\{-1,1\}^m$ with the (normalized) Hamming distance $d_H(u,v) = \frac{1}{m} \sum_{i=1}^m \mathbf{1}_{\{u_i \ne v_i\}}$, which is the fraction of the coordinates where $u$ and $v$ are different.

An arrangement of $m$ hyperplanes in $\mathbb{R}^n$ defines a sign map $f : \mathbb{R}^n \to \{-1,1\}^m$ which sends $x \in \mathbb{R}^n$ to the sign vector of the orientations of $x$ with respect to the hyperplanes. The sign map is uniquely defined up to the isometries of the Hamming cube. Let $a_1, \dots, a_m \in \mathbb{R}^n$ be normals of the hyperplanes, and consider the $m \times n$ matrix $A$ with rows $a_i$. The sign map can be expressed as

$f(x) = \mathrm{sign}\, Ax, \quad f : \mathbb{R}^n \to \{-1,1\}^m,$

where $\mathrm{sign}\, Ax$ denotes the vector of signs of the coordinates $\langle a_i, x\rangle$ of $Ax$. The fraction $d_A(x,y)$ of the hyperplanes that separate points $x$ and $y$ thus equals

$d_A(x,y) = d_H(\mathrm{sign}\, Ax, \mathrm{sign}\, Ay), \quad x, y \in \mathbb{R}^n.$

Then looking back at the definition of uniform tessellations, we observe the following fact:

Fact 1.4 (Embeddings by uniform tessellations). Consider a $\delta$-uniform tessellation of a set $K \subseteq S^{n-1}$ by $m$ hyperplanes. Then the set $K$ (with the induced geodesic distance) can be $\delta$-isometrically embedded into the Hamming cube $\{-1,1\}^m$. The sign map provides such an embedding. □

This allows us to state Theorem 1.2 as follows:

Theorem 1.5 (Embeddings into the Hamming cube). Consider a subset $K \subseteq S^{n-1}$ and let $\delta > 0$. Let

$m \ge C\delta^{-6} w(K)^2.$

Then $K$ can be $\delta$-isometrically embedded into the Hamming cube $\{-1,1\}^m$.

Moreover, let $A$ be an $m \times n$ random matrix with independent $\mathcal{N}(0,1)$ entries. Then with probability at least $1 - 2\exp(-c\delta^2 m)$, the sign map

(1.3) $f(x) = \mathrm{sign}\, Ax, \quad f : K \to \{-1,1\}^m$

is a $\delta$-isometric embedding. □
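The sign map (1.3) is easy to simulate for a small finite set. The sketch below is only an empirical illustration under assumed parameters ($n = 30$, $m = 2000$, ten random points, a fixed seed); it does not verify the constants of Theorem 1.5, and all function names are hypothetical.

```python
import math
import random

def gaussian_matrix(m, n, rng):
    """m x n matrix with independent N(0, 1) entries."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

def sign_map(A, x):
    """f(x) = sign(Ax), with sign(0) taken as +1 (an assumed convention)."""
    return [1 if sum(a * b for a, b in zip(row, x)) >= 0 else -1 for row in A]

def hamming(u, v):
    """Normalized Hamming distance: fraction of coordinates that differ."""
    return sum(1 for ui, vi in zip(u, v) if ui != vi) / len(u)

def geodesic(x, y):
    """Normalized geodesic distance on the unit sphere."""
    c = max(-1.0, min(1.0, sum(a * b for a, b in zip(x, y))))
    return math.acos(c) / math.pi

def random_unit_vector(n, rng):
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(t * t for t in g))
    return [t / norm for t in g]

rng = random.Random(1)
n, m = 30, 2000
K = [random_unit_vector(n, rng) for _ in range(10)]
A = gaussian_matrix(m, n, rng)
codes = [sign_map(A, x) for x in K]

# Largest gap between Hamming distance of codes and geodesic distance of points.
distortion = max(abs(hamming(codes[i], codes[j]) - geodesic(K[i], K[j]))
                 for i in range(len(K)) for j in range(i + 1, len(K)))
```

Here `distortion` plays the role of $\delta$ for this particular draw of $A$ and this particular finite $K$.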

1.2. Almost isometry of K and the tessellation graph. The image of the sign map $f$ in (1.3) has a special meaning. When the Hamming cube $\{-1,1\}^m$ is viewed as a graph (in which two points $u$, $v$ are connected if they differ in exactly one coordinate), the image of $f$ defines a subgraph of $\{-1,1\}^m$, which is called the tessellation graph of $K$. The tessellation graph has a vertex for each cell and an edge for each pair of adjacent cells, see Figure 2. Notice that the graph distance in the tessellation graph equals the number of hyperplanes that separate the two cells. Therefore the definition of a uniform tessellation yields:

Fact 1.6 (Graphs of uniform tessellations). Consider a $\delta$-uniform tessellation of a set $K \subseteq S^{n-1}$. Then $K$ is $\delta$-isometric to the tessellation graph of $K$. □

Hence we can read the conclusion of Theorem 1.2 as follows: $K$ is $\delta$-isometric to the graph of its tessellation by $m$ random hyperplanes, where $m \sim \delta^{-6} w(K)^2$.




Figure 2. The graph of a tessellation of a set in the plane. The dashed lines represent the edges.

1.3. Computing mean width. Powerful methods to estimate the mean width $w(K)$ have been
developed in connection with stochastic processes. These methods include Sudakov's and Dudley's
inequalities which relate $w(K)$ to the covering numbers of $K$ in the Euclidean metric, and the sharp
technique of majorizing measures (see (HUB]). 

Mean width has a simple (and known) geometric interpretation. By the rotational invariance of the Gaussian random vector $g$ in (1.2), one can replace $g$ with a random vector $\theta$ that is uniformly distributed on $S^{n-1}$, as follows:

$w(K) = c_n \sqrt{n} \cdot \bar{w}(K), \quad \text{where } \bar{w}(K) = \mathbb{E} \sup_{x \in K} |\langle \theta, x\rangle|.$

Here $c_n$ are numbers that depend only on $n$ and such that $c_n \le 1$ and $\lim_{n \to \infty} c_n = 1$. We may refer to $\bar{w}(K)$ as the spherical mean width of $K$. Let us assume for simplicity that $K$ is symmetric with respect to the origin. Then $2 \sup_{x \in K} |\langle \theta, x\rangle|$ is the width of $K$ in the direction $\theta$, which is the distance between the two supporting hyperplanes of $K$ whose normals are $\theta$. Since $\bar{w}(K)$ averages $\sup_{x \in K} |\langle \theta, x\rangle|$ rather than twice this quantity, the spherical mean width $\bar{w}(K)$ is half the average width of $K$ over all directions.
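The passage from the Gaussian to the spherical mean width rests on a polar decomposition of the Gaussian vector. The short derivation below is a standard computation included as a reading aid; it writes $\bar{w}(K)$ for the spherical mean width and identifies the constant $c_n$ explicitly, which the text above does not do.

```latex
% Write g = \|g\|_2 \, \theta with \theta = g / \|g\|_2. By rotation invariance,
% \theta is uniform on S^{n-1} and independent of \|g\|_2. Hence
\[
  w(K) = \mathbb{E} \sup_{x \in K} |\langle g, x\rangle|
       = \mathbb{E}\,\|g\|_2 \cdot \mathbb{E} \sup_{x \in K} |\langle \theta, x\rangle|
       = \mathbb{E}\,\|g\|_2 \cdot \bar{w}(K),
\]
% so one may take c_n = \mathbb{E}\|g\|_2 / \sqrt{n}. By Jensen's inequality,
\[
  \mathbb{E}\,\|g\|_2 \le \bigl( \mathbb{E}\,\|g\|_2^2 \bigr)^{1/2} = \sqrt{n},
  \qquad \text{hence } c_n \le 1 .
\]
```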

1.4. Dimension reduction. Our results are already non-trivial in the particular case $K = S^{n-1}$. Since $w(S^{n-1}) \le \sqrt{n}$, Theorems 1.2 and 1.5 hold with $m \sim n$. But more importantly, many interesting sets $K \subseteq S^{n-1}$ satisfy $w(K) \ll \sqrt{n}$ and therefore make our results hold with $m \sim w(K)^2 \ll n$. In such cases, one can view the sign map $f(x) = \mathrm{sign}\, Ax$ in Theorem 1.5 as a dimension reduction mechanism that transforms an $n$-dimensional set $K$ into a subset of $\{-1,1\}^m$.

A heuristic reason why dimension reduction is possible is that the quantity $w(K)^2$ measures the effective dimension of a set $K \subseteq S^{n-1}$. The effective dimension $w(K)^2$ of a set $K \subseteq S^{n-1}$ is always bounded by the algebraic dimension, but it may be much smaller and it is robust with respect to perturbations of $K$. In this regard, the notion of effective dimension is parallel to the notion of effective rank of a matrix from numerical linear algebra (see e.g. [14]). With these observations in mind, it is not surprising that the "true", effective dimension of $K$ would be revealed (and would be the only obstruction according to Theorem 1.5) when $K$ is being squeezed into a space of smaller dimension.

Let us illustrate dimension reduction on the example of finite sets $K \subseteq S^{n-1}$. Since $w(K) \le C\sqrt{\log |K|}$ (see e.g. [11, (3.13)]), Theorem 1.5 holds with $m \sim \log |K|$, and we can state it as follows.

Corollary 1.7 (Dimension reduction for finite sets). Let $K \subseteq S^{n-1}$ be a finite set. Let $\delta > 0$ and $m \ge C\delta^{-6} \log |K|$. Then $K$ can be $\delta$-isometrically embedded into the Hamming cube $\{-1,1\}^m$. □

This fact should be compared to the Johnson-Lindenstrauss lemma for finite subsets $K \subseteq \mathbb{R}^n$ (see [12, Section 15.2]), which states that if $m \ge C\delta^{-2} \log |K|$ then $K$ can be Lipschitz embedded into $\mathbb{R}^m$ as follows:

$\bigl| \|\bar{A}x - \bar{A}x'\|_2 - \|x - x'\|_2 \bigr| \le \delta \|x - x'\|_2, \quad x, x' \in K.$

Here $\bar{A} = m^{-1/2} A$ is the rescaled random Gaussian matrix $A$ from Theorem 1.5. Note that while the Johnson-Lindenstrauss lemma involves a Lipschitz embedding from $\mathbb{R}^n$ to $\mathbb{R}^m$, it is generally impossible to provide a Lipschitz embedding from subsets of $\mathbb{R}^n$ to the Hamming cube (if there are points $x, x' \in K$ that are very close to each other); this is why we consider $\delta$-isometric embeddings.



Like the Johnson-Lindenstrauss lemma, Corollary 1.7 can be proved directly by combining concentration inequalities for $d_A(x,y)$ with a union bound over $|K|^2$ pairs $(x,y) \in K \times K$. In fact, this method of proof allows for the weaker requirement $m \ge C\delta^{-2} \log |K|$. However, as we discuss later, this argument cannot be generalized in a straightforward way to prove Theorem 1.5 for general sets $K$. The Hamming distance $d_A(x,y)$ is highly discontinuous, which makes it difficult to extend estimates from points $x, y$ in an $\varepsilon$-net of $K$ to nearby points.



1.5. Cells of uniform tessellations. We mentioned two nice features of uniform tessellations in Facts 1.4 and 1.6. Let us observe one more property: all cells of a uniform tessellation have small diameter. Indeed, $d_A(x,y) = 0$ iff points $x, y$ are in the same cell, so by (1.1) we have:

Fact 1.8 (Cells are small). Every cell of a $\delta$-uniform tessellation has diameter at most $\delta$. □

With this, Theorem 1.2 immediately implies the following:

Corollary 1.9 (Cells of random uniform tessellations). Consider a tessellation of a subset $K \subseteq S^{n-1}$ by $m \ge C\delta^{-6} w(K)^2$ random hyperplanes. Then, with probability at least $1 - \exp(-c\delta^2 m)$, all cells of the tessellation have diameter at most $\delta$.

This result also has a direct proof, which moreover gives a slightly better bound $m \sim \delta^{-4} w(K)^2$. We present this "curvature argument" in Section 3.

1.6. Uniform tessellations in $\mathbb{R}^n$. So far, we only worked with subsets $K \subseteq S^{n-1}$. It is not difficult to extend our results to bounded sets $K \subseteq \mathbb{R}^n$. This can be done by embedding such a set $K$ into $S^n$ (the sphere in one more dimension) with small bi-Lipschitz distortion. This elementary argument is presented in Section 6, and it yields the following version of Theorem 1.2.

Theorem 1.10 (Random uniform tessellations in $\mathbb{R}^n$). Consider a bounded subset $K \subseteq \mathbb{R}^n$ with $\mathrm{diam}(K) = 1$. Let

(1.4) $m \ge C\delta^{-12} w(K - K)^2.$

Then there exists an arrangement of $m$ affine hyperplanes in $\mathbb{R}^n$ and a scaling factor $\lambda > 0$ such that

$\bigl| \lambda \cdot d_A(x,y) - \|x - y\|_2 \bigr| \le \delta, \quad x, y \in K.$

Here $d_A(x,y)$ denotes the fraction of the affine hyperplanes that separate $x$ and $y$.

Remark 1.11 (Mean width in $\mathbb{R}^n$). While the quantity $w(K - K)$ appearing in (1.4) is clearly bounded by $2w(K)$, it is worth noting that the quantity $w(K - K)$ captures more accurately than $w(K)$ the geometric nature of the "mean width" of $K$. Indeed, $w(K - K) = \mathbb{E}\, h(g)$ where $h(g) = \sup_{x \in K} \langle g, x\rangle - \inf_{x \in K} \langle g, x\rangle$ is the distance between the two parallel supporting hyperplanes of $K$ orthogonal to the random direction $g$, scaled by $\|g\|_2$.

1.7. Optimality. The main object of our study is $m(K) = m(K, \delta)$, the smallest number of hyperplanes that provide a $\delta$-uniform tessellation of a set $K \subseteq S^{n-1}$. One has

(1.5) $\log_2 N(K, \delta) \le m(K, \delta) \le C\delta^{-6} w(K)^2,$

where $N(K, \delta)$ denotes the covering number of $K$, i.e. the smallest number of balls of radius $\delta$ that cover $K$. The upper bound in (1.5) is the conclusion of Theorem 1.2. The lower bound holds because a $\delta$-uniform tessellation provides a decomposition of $K$ into at most $2^m$ cells, each of which lies in a ball of radius $\delta$ by Fact 1.8.

To compare the upper and lower bounds in (1.5), recall Sudakov's inequality [11, Theorem 3.18] that yields

$\log N(K, \delta) \le C\delta^{-2} w(K)^2.$

While Sudakov's inequality cannot be reversed in general, there are many situations where it is sharp. Moreover, according to Dudley's inequality (see [11, Theorem 11.17] and [13, Lemma 2.33]), Sudakov's inequality can always be reversed for some scale $\delta > 0$ and up to a logarithmic factor in $n$. (See also [10] for a discussion of sharpness of Sudakov's inequality.) So the two sides of (1.5) are often close to each other, but there is in general some gap. We conjecture that the optimal estimate is

$c\, w(K)^2 \le \sup_{\delta > 0} \delta^2 m(K, \delta) \le C\, w(K)^2,$

so the mean width of $K$ seems to be completely responsible for the uniform tessellations of $K$.

Note that the lower bound in (1.5) holds in greater generality. Namely, it is not possible to have $m < \log_2 N(K, \delta)$ for any decomposition of $K$ into $2^m$ pieces of diameter at most $\delta$. However, from the upper bound we see that with a slightly larger value $m \sim w(K)^2$, an almost best decomposition of $K$ is achieved by a random hyperplane tessellation.

In this paper we have not tried to optimize the dependence of $m(K, \delta)$ on $\delta$. This interesting problem is related to the open question on the optimal dependence on distortion in Dvoretzky's theorem. We comment on this in Section 3.2.



1.8. Related work: embeddings of K into normed spaces. Embeddings of subsets $K \subseteq S^{n-1}$ into normed spaces were studied in geometric functional analysis [8, 15]. In particular, Klartag and Mendelson [8] were concerned with embeddings into $\ell_2^m$. They showed that for $m \ge C\delta^{-2} w(K)^2$ there exists a linear map $A : \mathbb{R}^n \to \ell_2^m$ such that

$\bigl| m^{-1/2} \|Ax\|_2 - 1 \bigr| \le \delta, \quad x \in K.$

One can choose $A$ to be an $m \times n$ random matrix with Gaussian entries as in Theorem 1.5, or with sub-gaussian entries. Schechtman [15] gave a simpler argument for a Gaussian matrix, which also works for embeddings into general normed spaces $X$. In the specific case of $X = \ell_1^m$, Schechtman's result states that for $m \ge C\delta^{-2} w(K)^2$ one has

$\bigl| m^{-1} \|Ax\|_1 - 1 \bigr| \le \delta, \quad x \in K.$

This result also follows from Lemma 2.1 below.

1.9. Related work: one-bit compressed sensing. Our present work was motivated by the development of one-bit compressed sensing in [3, 6, 17], where Theorem 1.5 is used in the following context. The vector $x$ represents a signal; the matrix $A$ represents a measurement map $\mathbb{R}^n \to \mathbb{R}^m$ that produces $m \ll n$ linear measurements of $x$; taking the sign of $Ax$ represents quantization of the measurements (an extremely coarse, one-bit quantization). The problem of one-bit compressed sensing is to recover the signal $x$ from the quantized measurements $f(x) = \mathrm{sign}\, Ax$.

The problem of one-bit compressed sensing was introduced by Boufounos and Baraniuk [3]. Jacques, Laska, Boufounos and Baraniuk [6] realized a connection of this problem to uniform tessellations of the set of sparse signals $K = \{x \in S^{n-1} : |\mathrm{supp}(x)| \le s\}$, and to almost isometric embedding of $K$ into the Hamming cube $\{-1,1\}^m$. For this set $K$, they proved Corollary 1.9 with $m \sim \delta^{-1} s \log(n/\delta)$ and a version of Theorem 1.5 for $m \sim \delta^{-2} s \log(n/\delta)$. The authors of the present paper analyzed in [17] a bigger set of "compressible" signals $K' = \{x \in S^{n-1} : \|x\|_1 \le \sqrt{s}\}$ and proved for $K'$ a version of Corollary 1.9 with $m \sim \delta^{-4} s \log(n/s)$. Since the mean widths of both sets $K$ and $K'$ are of the order $\sqrt{s \log(n/s)}$, Theorem 1.5 holds for these sets with $m \sim \delta^{-6} s \log(n/s)$. In other words, apart from the dependence on $\delta$ (which is an interesting problem), the prior results follow as partial cases from Theorem 1.5.



It is important to note that Theorem 1.5 addresses only the theoretical aspect of the one-bit compressed sensing problem: it guarantees that the quantized measurement map $f(x) = \mathrm{sign}\, Ax$ preserves the geometry of signals well. But one also faces an algorithmic challenge: how to efficiently recover $x$ from $f(x)$, and specifically in polynomial time. We will not touch on this algorithmic aspect here but rather refer the reader to [17] and to our forthcoming work which is based on the results of this paper.

1.10. Overview of the argument. Let us briefly describe our proof of the results stated above. Since the distance in the Hamming cube $\{-1,1\}^m$ can be expressed as $(2m)^{-1} \|x - y\|_1$, the Hamming cube is isometrically embedded in $\ell_1^m$. Before trying to embed $K \subseteq S^{n-1}$ into the Hamming cube as claimed in Theorem 1.5, we shall make a simpler step and embed $K$ almost isometrically into the bigger space $\ell_1^m$ with $m \sim \delta^{-2} w(K)^2$. A result of this type was given by Schechtman [15]. In Section 2 we prove a similar result by a simple and direct argument in probability in Banach spaces. Our next and non-trivial step is to re-embed the set from $\ell_1^m$ into its subset, the Hamming cube $\{-1,1\}^m$. In Section 3 we give a simple "curvature argument" that allows us to deduce Corollary 1.9 on the diameter of cells, and even with a better dependence on $\delta$, namely $m \sim \delta^{-4} w(K)^2$. However, a genuine limitation of the curvature argument makes it too weak to deduce Theorem 1.2 this way.

We instead attempt to prove Theorem 1.2 by an $\varepsilon$-net argument, which typically proceeds as follows: (a) show that $d_A(x,y) \approx d(x,y)$ holds for a fixed pair $x, y \in K$ with high probability; (b) take the union bound over all pairs $x, y$ in a finite $\varepsilon$-net $N_\varepsilon$ of $K$; (c) extend the estimate from $N_\varepsilon$ to $K$ by approximation. Unfortunately, as we indicate in Section 4, the approximation step (c) must fail due to the discontinuity of the Hamming distance $d_A(x,y)$.

A solution proposed in |18t [H] was to choose e so small that none of the random hyperplanes 
pass near points $x, y \in N_\varepsilon$ with high probability. This strategy was effective for the set $K = \{x \in S^{n-1} : |\mathrm{supp}(x)| \le s\}$ because the covering number of this specific set $K$ has a mild (logarithmic) dependence on $\varepsilon$, namely $\log N(K, \varepsilon) \le s \log(Cn/\varepsilon s)$. However, adapting this strategy to general sets $K$ would cause our estimate on $m$ to increase by a factor of $n$.

The solution we propose in the present paper is to "soften" the Hamming distance; see Section 4 for the precise notion. The soft Hamming distance enjoys some continuity properties as described in Lemmas 4.3 and 5.5. In Section 5.5 we develop the $\varepsilon$-net argument for the soft Hamming distance. Interestingly, the approximation step (c) for the soft Hamming distance will be based on the embedding of $K$ into $\ell_1^m$, which incidentally was our point of departure.

1.11. Notation. Throughout the paper, $C$, $c$, $C_1$, etc. denote positive absolute constants whose values may change from line to line. For integer $n$, we denote $[n] = \{1, \dots, n\}$. The $\ell_p$ norms of a vector $x \in \mathbb{R}^n$ for $p \in \{0, 1, 2, \infty\}$ are defined as$^1$

$\|x\|_0 = |\mathrm{supp}(x)| = |\{i \in [n] : x(i) \ne 0\}|, \quad \|x\|_1 = \sum_{i=1}^n |x_i|, \quad \|x\|_2 = \Bigl(\sum_{i=1}^n x_i^2\Bigr)^{1/2}, \quad \|x\|_\infty = \max_{i \in [n]} |x_i|.$

We shall work with normed spaces $\ell_p^n = (\mathbb{R}^n, \|\cdot\|_p)$ for $p \in \{1, 2, \infty\}$. The unit Euclidean ball in $\mathbb{R}^n$ is denoted $B_2^n = \{x \in \mathbb{R}^n : \|x\|_2 \le 1\}$ and the unit Euclidean sphere is denoted $S^{n-1} = \{x \in \mathbb{R}^n : \|x\|_2 = 1\}$.

As usual, $\mathcal{N}(0, 1)$ stands for the univariate normal distribution with zero mean and unit variance, and $\mathcal{N}(0, I_n)$ stands for the multivariate normal distribution in $\mathbb{R}^n$ with zero mean and whose covariance matrix is the identity $I_n$.

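2. Embedding into $\ell_1$

Lemma 2.1 (Concentration). Consider a bounded subset $K \subseteq \mathbb{R}^n$ and independent random vectors $a_1, \dots, a_m \sim \mathcal{N}(0, I_n)$ in $\mathbb{R}^n$. Let

$Z = \sup_{x \in K} \Bigl| \frac{1}{m} \sum_{i=1}^m |\langle a_i, x\rangle| - \sqrt{\frac{2}{\pi}}\, \|x\|_2 \Bigr|.$

(a) One has

(2.1) $\mathbb{E}\, Z \le \frac{4 w(K)}{\sqrt{m}}.$

(b) The following deviation inequality holds:

(2.2) $\mathbb{P}\Bigl\{ Z > \frac{4 w(K)}{\sqrt{m}} + u \Bigr\} \le 2 \exp\Bigl( -\frac{m u^2}{2 d(K)^2} \Bigr), \quad u > 0,$

where $d(K) = \max_{x \in K} \|x\|_2$.

$^1$ Note that, strictly speaking, $\|\cdot\|_0$ is not a norm on $\mathbb{R}^n$.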



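Proof. (a) Note that $\mathbb{E}\, |\langle a_i, x\rangle| = \sqrt{2/\pi}\, \|x\|_2$ for all $i$. Let $\varepsilon_1, \dots, \varepsilon_m$ be a sequence of iid Rademacher random variables. A standard symmetrization argument (see [11, Lemma 6.3]) followed by the contraction principle (see [11, Theorem 4.12]) yields that

$\mathbb{E}\, Z \le 2\, \mathbb{E} \sup_{x \in K} \Bigl| \frac{1}{m} \sum_{i=1}^m \varepsilon_i |\langle a_i, x\rangle| \Bigr| \le 4\, \mathbb{E} \sup_{x \in K} \Bigl| \frac{1}{m} \sum_{i=1}^m \varepsilon_i \langle a_i, x\rangle \Bigr| = 4\, \mathbb{E} \sup_{x \in K} \Bigl| \Bigl\langle \frac{1}{m} \sum_{i=1}^m \varepsilon_i a_i, x \Bigr\rangle \Bigr|.$

By the rotational invariance of the Gaussian distribution, $\frac{1}{m} \sum_{i=1}^m \varepsilon_i a_i$ is distributed identically with $g/\sqrt{m}$ where $g \sim \mathcal{N}(0, I_n)$. Therefore

$\mathbb{E}\, Z \le \frac{4}{\sqrt{m}}\, \mathbb{E} \sup_{x \in K} |\langle g, x\rangle| = \frac{4 w(K)}{\sqrt{m}}.$

This proves the upper bound in (2.1).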

(b) We combine the result of (a) with the Gaussian concentration inequality. To this end, we must first show that the map $A \mapsto Z = Z(A)$ is Lipschitz, where $A = (a_1, \dots, a_m)$ is considered as a matrix in the space $\mathbb{R}^{nm}$ equipped with the Frobenius norm $\|\cdot\|_F$ (which coincides with the Euclidean norm on $\mathbb{R}^{nm}$). It follows from two applications of the triangle inequality followed by two applications of the Cauchy-Schwarz inequality that for $A = (a_1, \dots, a_m)$, $B = (b_1, \dots, b_m) \in \mathbb{R}^{nm}$ we have

$|Z(A) - Z(B)| \le \sup_{x \in K} \frac{1}{m} \sum_{i=1}^m |\langle a_i - b_i, x\rangle| \le \frac{d(K)}{m} \sum_{i=1}^m \|a_i - b_i\|_2 \le \frac{d(K)}{\sqrt{m}}\, \|A - B\|_F.$

Thus $Z$ has Lipschitz constant bounded by $d(K)/\sqrt{m}$. We may now bound the deviation probability for $Z$ using the Gaussian concentration inequality (see [11, Equation 1.6]) as follows:

$\mathbb{P}\{ |Z - \mathbb{E}\, Z| \ge u \} \le 2 \exp\bigl( -m u^2 / 2 d(K)^2 \bigr), \quad u > 0.$

The deviation inequality (2.2) now follows from the bound on $\mathbb{E}\, Z$ from (a). □

Remark 2.2 (Random matrix formulation). One can state Lemma 2.1 in terms of random matrices. Indeed, let $A$ be an $m \times n$ random matrix with independent $\mathcal{N}(0, 1)$ entries. Then its rows $a_i$ satisfy the assumption of Lemma 2.1, and we can express $Z$ as

(2.3) $Z = \sup_{x \in K} \Bigl| \frac{1}{m} \|Ax\|_1 - \sqrt{\frac{2}{\pi}}\, \|x\|_2 \Bigr|.$
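The quantity $Z$ in its random matrix form can be evaluated numerically when $K$ is finite, which gives a feel for the $1/\sqrt{m}$ scale in Lemma 2.1. The following is a hedged sketch only: the function name `l1_deviation`, the dimensions, the eight random unit vectors, and the seed are assumptions chosen for illustration.

```python
import math
import random

def l1_deviation(A, points):
    """Z = max over x of | (1/m) * ||Ax||_1 - sqrt(2/pi) * ||x||_2 |."""
    m = len(A)
    worst = 0.0
    for x in points:
        l1 = sum(abs(sum(a * b for a, b in zip(row, x))) for row in A) / m
        l2 = math.sqrt(sum(t * t for t in x))
        worst = max(worst, abs(l1 - math.sqrt(2.0 / math.pi) * l2))
    return worst

# Deterministic check on a tiny matrix: (|1| + |-3|) / 2 - sqrt(2/pi).
Z_small = l1_deviation([[1.0], [-3.0]], [[1.0]])

# Random Gaussian matrix and a few random unit vectors.
rng = random.Random(2)
n, m = 20, 4000
points = []
for _ in range(8):
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(t * t for t in g))
    points.append([t / norm for t in g])
A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
Z = l1_deviation(A, points)
```

For unit vectors, each average $\frac{1}{m}\|Ax\|_1$ fluctuates around $\sqrt{2/\pi}$ at scale $m^{-1/2}$, so `Z` is small for large `m`.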

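Using this remark for the set $K - K$, we obtain a linear embedding of $K$ into $\ell_1$:

Corollary 2.3 (Embedding into $\ell_1$). Consider a subset $K \subseteq B_2^n$ and let $\delta > 0$. Let

$m \ge C\delta^{-2} w(K)^2.$

Then, with probability at least $1 - 2\exp(-m\delta^2/32)$, the linear map $f : K \to \ell_1^m$ defined as $f(x) = \frac{1}{m} \sqrt{\frac{\pi}{2}}\, Ax$ is a $\delta$-isometry. Thus $K$ can be linearly embedded into $\ell_1^m$ with Gromov-Hausdorff distortion at most $\delta$.

Proof. Let $A$ be the random matrix as in Remark 2.2. Using Lemma 2.1 for $K - K$ with $u = \delta/2$ (note that $d(K - K) \le 2$) and noting the form of $Z$ in (2.3), we conclude that the following event holds with probability at least $1 - 2\exp(-m\delta^2/32)$:

$\Bigl| \frac{1}{m} \|Ax - Ay\|_1 - \sqrt{\frac{2}{\pi}}\, \|x - y\|_2 \Bigr| \le \frac{4 w(K - K)}{\sqrt{m}} + \frac{\delta}{2} \le \frac{8 w(K)}{\sqrt{m}} + \frac{\delta}{2} \le \delta, \quad x, y \in K.$

□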

Remark 2.4. The above argument shows in fact that Corollary 2.3 holds for

$m \ge C\delta^{-2} w(K - K)^2.$

As we noticed in Remark 1.11, the quantity $w(K - K)$ more accurately reflects the geometric meaning of the mean width than $w(K)$.



Remark 2.5 (Low $M^*$ estimate). Note that for the subspace $E = \ker A$ we have from (2.3) that $Z \ge \sup_{x \in K \cap E} \sqrt{2/\pi}\, \|x\|_2 = \sqrt{2/\pi}\, d(K \cap E)$. Then Lemma 2.1 implies that

(2.4) $\mathbb{E}\, d(K \cap E) \le \frac{6 w(K)}{\sqrt{m}}.$

By rotation invariance of the Gaussian distribution, inequality (2.4) holds for a random subspace $E$ in $\mathbb{R}^n$ of given codimension $m \le n$, uniformly distributed according to the Haar measure. This result recovers (up to the absolute constant 6 which can be improved) the so-called low $M^*$ estimate from geometric functional analysis, see [11, Section 15.1].

Remark 2.6 (Dimension reduction). As we emphasized in the introduction, for many sets $K \subseteq B_2^n$ one has $w(K) \ll \sqrt{n}$. In such cases Corollary 2.3 works for $m \ll n$. The embedding of $K$ into $\ell_1^m$ yields dimension reduction for $K$ (from $n$ to $m \ll n$ dimensions).

For example, if $K$ is a finite set then $w(K) \le C\sqrt{\log |K|}$ (see e.g. [11, (3.13)]), and so Corollary 2.3 applies with $m \sim \log |K|$. This gives the following variant of the Johnson-Lindenstrauss Lemma: every finite subset $K$ of a Euclidean space can be linearly embedded in $\ell_1^m$ with $m \sim \log |K|$ and with small distortion in the Gromov-Hausdorff metric. Stronger variants of Johnson-
Lindenstrauss lemma are known for Lipschitz rather than Gromov-Haussdorff embeddings into 
and i'^ [nils]- However, for general sets K (in particular for any set with nonempty interior) a Lip- 
schitz embedding into lower dimensions is clearly impossible; still a Gromov-Haussdorff embedding 



exists due to Corollary 2.3 



3. Proof of Corollary 1.9 by a curvature argument

In this section we give a short argument that leads to a version of Corollary 1.9 with a slightly better dependence of $m$ on $\delta$.

Theorem 3.1 (Cells of random uniform tessellations). Consider a subset $K \subseteq S^{n-1}$ and let $\delta > 0$. Let

$m \ge C\delta^{-4} w(K)^2$

and consider an arrangement of $m$ independent random hyperplanes in $\mathbb{R}^n$ that are uniformly distributed according to the Haar measure. Then, with probability at least $1 - 2\exp(-c\delta^4 m)$, all cells of the tessellation have diameter at most $\delta$.



The argument is based on Lemma 2.1. If points $x, y \in K$ belong to the same cell, then the midpoint $z = \frac{1}{2}(x + y)$ also belongs to the same cell (after normalization). Using Lemma 2.1 one can then show that $\|z\|_2 \approx \frac{1}{2}(\|x\|_2 + \|y\|_2) = 1$. Due to the curvature of the sphere, this forces the length of the interval $\|x - y\|_2$ to be small, which means that the diameter of the cell is small. The formal argument is below.

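Proof. We represent the random hyperplanes as $\{a_i\}^\perp$, where $a_1, \dots, a_m \sim \mathcal{N}(0, I_n)$ are independent random vectors in $\mathbb{R}^n$. Let $\delta$, $m$ be as in the assumptions of the theorem. We shall apply Lemma 2.1 for the sets $K$ and $\frac{1}{2}(K + K)$ and for $u = \varepsilon/2$, where we set $\varepsilon = \delta^2/16$. Since the diameters of both these sets are bounded by 1, we obtain that with probability at least $1 - 2\exp(-c\delta^4 m)$ the following event holds:

(3.1) $\Bigl| \sqrt{\frac{\pi}{2}}\, \frac{1}{m} \sum_{i=1}^m |\langle a_i, v\rangle| - \|v\|_2 \Bigr| \le \varepsilon, \quad v \in K \cup \tfrac{1}{2}(K + K).$

Assume that the event (3.1) holds. Consider a pair of points $x, y \in K$ that belong to the same cell of the tessellation, which means that

$\mathrm{sign}\langle a_i, x\rangle = \mathrm{sign}\langle a_i, y\rangle, \quad i \in [m].$

To complete the proof it suffices to show that $\|x - y\|_2 \le \delta$. This will give the desired diameter $\delta$ in the Euclidean metric. Furthermore, since for small $\delta$ the Euclidean and the geodesic distances are equivalent, the conclusion will hold for the geodesic distance as well.

We shall use (3.1) for $x, y \in K$ and for the midpoint $z := \frac{1}{2}(x + y) \in \frac{1}{2}(K + K)$. Clearly $\mathrm{sign}\langle a_i, z\rangle = \mathrm{sign}\langle a_i, x\rangle = \mathrm{sign}\langle a_i, y\rangle$, hence

$|\langle a_i, z\rangle| = \tfrac{1}{2}\bigl( |\langle a_i, x\rangle| + |\langle a_i, y\rangle| \bigr), \quad i \in [m].$

Therefore we obtain from (3.1) that

(3.2) $\|z\|_2 \ge \sqrt{\frac{\pi}{2}}\, \frac{1}{m} \sum_{i=1}^m |\langle a_i, z\rangle| - \varepsilon = \frac{1}{2} \Bigl[ \sqrt{\frac{\pi}{2}}\, \frac{1}{m} \sum_{i=1}^m |\langle a_i, x\rangle| + \sqrt{\frac{\pi}{2}}\, \frac{1}{m} \sum_{i=1}^m |\langle a_i, y\rangle| \Bigr] - \varepsilon \ge \frac{1}{2} \bigl( \|x\|_2 - \varepsilon + \|y\|_2 - \varepsilon \bigr) - \varepsilon = 1 - 2\varepsilon.$

By the parallelogram law, we conclude that

$\|x - y\|_2^2 = 4 - \|x + y\|_2^2 = 4(1 - \|z\|_2^2) \le 16\varepsilon = \delta^2.$

This completes the proof. □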



3.1. Limitations of the curvature argument. Unfortunately, the curvature argument does not lend itself to proving the more general result, Theorem 1.2 on uniform tessellations. To see why, suppose $x, y \in K$ do not belong to the same cell but instead $d_A(x,y) = d$ for some small $d \in (0, 1)$. Consider the set of mismatched signs

$T := \{ i \in [m] : \mathrm{sign}\langle a_i, x\rangle \ne \mathrm{sign}\langle a_i, y\rangle \}; \quad \frac{|T|}{m} = d.$

These signs create an additional error term in the right hand side of (3.2), which is

(3.3) $\frac{1}{m} \sum_{i \in T} |\langle a_i, v_i\rangle|, \quad \text{where } v_i \in \{x, y\}.$

By analogy with Lemma 2.1, we can expect that this term should be approximately equal to $|T|/m = d$. If this is true, then (3.2) becomes in our situation $\|z\|_2 \ge 1 - 2\varepsilon - d$, which leads as before to $\|x - y\|_2^2 \lesssim \varepsilon + d$. Ignoring $\varepsilon$, we see that the best estimate the curvature argument can give is $d(x,y) \lesssim \sqrt{d_A(x,y)}$ rather than $d(x,y) \lesssim d_A(x,y)$ that is required in Theorem 1.2.

The weak point of this argument is that it takes into account the size of $T$ but ignores the nature of $T$. For every $i \in T$, the hyperplane $\{a_i\}^\perp$ passes through the arc connecting $x$ and $y$. If the length of the arc $d(x,y)$ is small, this creates a strong constraint on $a_i$. Conditioning the distribution of $a_i$ on the constraint that $i \in T$ creates a bias toward smaller values of $|\langle a_i, x\rangle|$ and $|\langle a_i, y\rangle|$. As a result, the conditional expected value of the error term (3.3) should be smaller than $d$. Computing this conditional expectation is not a problem for a given pair $x, y$, but it seems to be difficult to carry out a uniform argument over $x, y \in K$ where the (conditional) distribution of $a_i$ depends on $x, y$.

We instead propose a different and somewhat more conceptual way to deduce Theorem 1.2 from Lemma 2.1. This argument will be developed in the rest of this paper.



3.2. Dvoretzky theorem and dependence on $\delta$. The unusual dependence $\delta^{-4}$ in Theorem 3.1 is related to the open problem of the optimal dependence on distortion in the Dvoretzky theorem. Indeed, consider the special case of the tessellation problem where $K = S^{n-1}$ and $w(K) \sim \sqrt{n}$. Then Lemma 2.1 in its geometric formulation (see equation (2.3) and Corollary 2.3) states that $\ell_2^n$ embeds into $\ell_1^m$ whenever $m \ge C\varepsilon^{-2} n$, meaning that

$(1 - \varepsilon) \|x\|_2 \le \|\Phi x\|_1 \le (1 + \varepsilon) \|x\|_2, \quad x \in \mathbb{R}^n,$

where $\Phi = \frac{1}{m} \sqrt{\frac{\pi}{2}}\, A$. Equivalently, there exists an $n$-dimensional subspace of $\ell_1^m$ that is $(1 + \varepsilon)$-Euclidean, where $n \sim \varepsilon^2 m$. This result recovers the well known Dvoretzky theorem in V. Milman's formulation (see [5, Theorem 4.2.1]) for the space $\ell_1^m$ and with the best known dependence on $\varepsilon$. However, it is not known whether $\varepsilon^2$ is the optimal dependence for $\ell_1^m$; see [15] for a discussion of the general problem of dependence on $\varepsilon$ in the Dvoretzky theorem.

These observations suggest that we can reverse our logic. Suppose one can prove the Dvoretzky theorem for $\ell_1^m$ with a better dependence on $\varepsilon$, thereby constructing a $(1 + \varepsilon)$-Euclidean subspace of dimension $n \sim f(\varepsilon) m$ with $f(\varepsilon) \gg \varepsilon^2$. Then such a construction can replace Lemma 2.1 in the curvature argument. This will lead to Theorem 3.1 for $K = S^{n-1}$ with an improved dependence on $\delta$, namely with $m \sim n / f(\delta^2)$. Concerning lower bounds, the best possible dependence of $m$ on $\delta$ should be $\delta^{-1}$, which follows by considering the case $n = 2$. This dependence will be achieved if the Dvoretzky theorem for $\ell_1^m$ is valid with $n \sim \varepsilon^{1/2} m$. This is unknown.

4. Toward Theorem 1.2: a soft Hamming distance

Our proof of Theorem 1.2 will be based on a covering argument. A standard covering argument 
of geometric functional analysis would proceed in our situation as follows: 

(a) Show that $d_A(x,y) \approx d(x,y)$ with high probability for a fixed pair $x, y$. This can be done using standard concentration inequalities.

(b) Prove that $d_A(x,y) \approx d(x,y)$ uniformly for all $x, y$ in a finite $\varepsilon$-net $N_\varepsilon$ of $K$. Sudakov's inequality can be used to estimate the cardinality of $N_\varepsilon$ via the mean width $w(K)$. The conclusion will follow from step (a) by the union bound over $(x,y) \in N_\varepsilon \times N_\varepsilon$.

(c) Extend the estimate $d_A(x,y) \approx d(x,y)$ from $x, y \in N_\varepsilon$ to $x, y \in K$ by approximation.

While the first two steps are relatively standard, step (c) poses a challenge in our situation. The Hamming distance $d_A(x,y)$ is a discontinuous function of $x, y$, so it is not clear whether the estimate $d_A(x,y) \approx d(x,y)$ can be extended from a pair of points $x, y \in N_\varepsilon$ to a pair of nearby points. In fact, for some tessellations this task is impossible. Figure 3 shows that there exist very non-uniform tessellations that are nevertheless very uniform for an $\varepsilon$-net, namely one has $d_A(x,y) = d(x,y)$ for all $x, y \in N_\varepsilon$. The set $K$ in that example is a subset of the plane $\mathbb{R}^2$, and one can clearly embed such a set into the sphere $S^2$ as well.

Figure 3. This hyperplane tessellation of the set $K = [-\tfrac12, \tfrac12] \times [-\tfrac{\varepsilon}{2}, \tfrac{\varepsilon}{2}]$ is very non-uniform, as all cells have diameter at least 1. The tessellation is nevertheless very uniform for the $\varepsilon$-net $N_\varepsilon = \varepsilon\mathbb{Z} \cap K$, as $d_A(x,y) = \|x - y\|_2$ for all $x, y \in N_\varepsilon$.

To overcome the discontinuity problem, we propose to work with a soft version of the Hamming distance. Recall that $m$ hyperplanes are determined by their normals $a_1, \ldots, a_m \in \mathbb{R}^n$, which we organize in an $m \times n$ matrix $A$ with rows $a_i$. Then the usual ("hard") Hamming distance $d_A(x,y)$ on $\mathbb{R}^n$ with respect to $A$ can be expressed as

(4.1)    $d_A(x,y) = \frac{1}{m} \sum_{i=1}^m \mathbf{1}_{\mathcal{E}_i}$, where $\mathcal{E}_i = \{\operatorname{sign}\langle a_i, x\rangle \ne \operatorname{sign}\langle a_i, y\rangle\}$.

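Definition 4.1 (Soft Hamming distance). Consider an $m \times n$ matrix $A$ with rows $a_1, \ldots, a_m$, and let $t \in \mathbb{R}$. The soft Hamming distance $d_A^t(x,y)$ on $\mathbb{R}^n$ is defined as

    $d_A^t(x,y) = \frac{1}{m} \sum_{i=1}^m \mathbf{1}_{\mathcal{F}_i}$, where

(4.2)    $\mathcal{F}_i = \{\langle a_i, x\rangle > t,\ \langle a_i, y\rangle < -t\} \cup \{-\langle a_i, x\rangle > t,\ -\langle a_i, y\rangle < -t\}$.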

Both positive and negative t may be considered. For positive t the soft Hamming distance 
counts the hyperplanes that separate x, y well enough; for negative t it counts the hyperplanes that 
separate or nearly separate x,y. 
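The two distances can be compared directly on a random arrangement. The sketch below is only an illustration of (4.1) and (4.2): the helper names `dot`, `hard_hamming` and `soft_hamming`, the dimensions, the seed and the test points are assumptions made for this example.

```python
import random


def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))


def hard_hamming(A, x, y):
    """Fraction of rows a_i with sign<a_i,x> != sign<a_i,y>, as in (4.1)."""
    def sign(v):
        return (v > 0) - (v < 0)
    return sum(sign(dot(a, x)) != sign(dot(a, y)) for a in A) / len(A)


def soft_hamming(A, x, y, t):
    """Fraction of rows a_i for which the event F_i of (4.2) occurs."""
    count = 0
    for a in A:
        u, v = dot(a, x), dot(a, y)
        if (u > t and v < -t) or (-u > t and -v < -t):
            count += 1
    return count / len(A)


rng = random.Random(1)   # fixed seed, illustrative only
n, m = 3, 2000
A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x, y = [1.0, 0.0, 0.0], [0.6, 0.8, 0.0]

# Soft distances for increasing t: the sequence should be non-increasing.
values = [soft_hamming(A, x, y, t) for t in (-0.2, -0.1, 0.0, 0.1, 0.2)]
print(values, hard_hamming(A, x, y))
```

For $t = 0$ the two functions agree whenever no inner product is exactly zero, which holds almost surely for Gaussian rows.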

Remark 4.2 (Comparison of soft and hard Hamming distances). Clearly $d_A^t(x,y)$ is a non-increasing function of $t$. Moreover,

    $d_A^t(x,y) = d_A(x,y)$ for $t = 0$;
    $d_A^t(x,y) \le d_A(x,y)$ for $t \ge 0$;
    $d_A^t(x,y) \ge d_A(x,y)$ for $t \le 0$.

The soft Hamming distance for a fixed t is as discontinuous as the usual (hard) Hamming distance. 
However, some version of continuity emerges when we allow t to vary slightly: 

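Lemma 4.3 (Continuity). Let $x, y, x', y' \in \mathbb{R}^n$, and assume that $\|Ax'\|_\infty \le \varepsilon$, $\|Ay'\|_\infty \le \varepsilon$ for some $\varepsilon > 0$. Then for every $t \in \mathbb{R}$ one has

    $d_A^{t+\varepsilon}(x,y) \le d_A^t(x + x', y + y') \le d_A^{t-\varepsilon}(x,y)$.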



Proof. Consider the events $\mathcal{F}_i = \mathcal{F}_i(x, y, t)$ from the definition of the soft Hamming distance (4.2). By the assumptions, we have $|\langle a_i, x'\rangle| \le \varepsilon$, $|\langle a_i, y'\rangle| \le \varepsilon$ for all $i \in [m]$. This implies by the triangle inequality that

    $\mathcal{F}_i(x, y, t + \varepsilon) \subseteq \mathcal{F}_i(x + x', y + y', t) \subseteq \mathcal{F}_i(x, y, t - \varepsilon)$.

The conclusion of the lemma follows. □
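The sandwich inequality of Lemma 4.3 is deterministic, so it can be checked on any concrete matrix. The following sketch assumes hypothetical helper names and arbitrary sizes; here `eps` is computed from the perturbations as the larger of $\|Ax'\|_\infty$ and $\|Ay'\|_\infty$, with a tiny slack added to guard against floating-point rounding.

```python
import random


def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))


def soft_hamming(A, x, y, t):
    """Fraction of rows a_i for which the event F_i of (4.2) occurs."""
    count = 0
    for a in A:
        u, v = dot(a, x), dot(a, y)
        if (u > t and v < -t) or (-u > t and -v < -t):
            count += 1
    return count / len(A)


rng = random.Random(2)   # fixed seed, illustrative only
n, m = 4, 500
A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [rng.gauss(0.0, 1.0) for _ in range(n)]
xp = [0.01 * rng.gauss(0.0, 1.0) for _ in range(n)]   # perturbation x'
yp = [0.01 * rng.gauss(0.0, 1.0) for _ in range(n)]   # perturbation y'

# eps bounds both ||A x'||_inf and ||A y'||_inf (plus rounding slack).
eps = max(max(abs(dot(a, xp)), abs(dot(a, yp))) for a in A) + 1e-12
xs = [u + v for u, v in zip(x, xp)]
ys = [u + v for u, v in zip(y, yp)]

checks = []
for t in (-0.3, 0.0, 0.3):
    lower = soft_hamming(A, x, y, t + eps)
    middle = soft_hamming(A, xs, ys, t)
    upper = soft_hamming(A, x, y, t - eps)
    checks.append(lower <= middle <= upper)
print(checks)
```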



We are ready to state a stronger version of Theorem 1.2 for the soft Hamming distance.

Theorem 4.4 (Random uniform tessellations: soft version). Consider a subset $K \subseteq S^{n-1}$ and let $\delta > 0$. Let

    $m \ge C\delta^{-6} w(K)^2$

and pick $t \in \mathbb{R}$. Consider an $m \times n$ random (Gaussian) matrix $A$ with independent rows $a_1, \ldots, a_m \sim \mathcal{N}(0, I_n)$. Then with probability at least $1 - \exp(-c\delta^2 m)$, one has

    $|d_A^t(x,y) - d(x,y)| \le \delta + 2|t|, \quad x, y \in K.$

This theorem will be proved in the rest of the paper.

5. Proof of Theorem 4.4 on the soft Hamming distance

We will follow the covering argument outlined in the beginning of Section 4, but instead of $d_A(x,y)$ we shall work with the soft Hamming distance $d_A^t(x,y)$.

5.1. Concentration of distance for a given pair. At the first step, we will check that $d_A^t(x,y) \approx d(x,y)$ with high probability for a fixed pair $x, y$. Let us first verify that this estimate holds in expectation, i.e. that $\mathbb{E}\, d_A^t(x,y) \approx d(x,y)$. One can easily check that

(5.1)    $\mathbb{E}\, d_A(x,y) = d(x,y)$,

so we may just compare $\mathbb{E}\, d_A^t(x,y)$ to $\mathbb{E}\, d_A(x,y)$. Here is a slightly stronger result:

Lemma 5.1 (Comparing soft and hard Hamming distances in expectation). Let $A$ be a random Gaussian matrix as in Theorem 4.4. Then, for every $t \in \mathbb{R}$ and every $x, y \in \mathbb{R}^n$, one has

    $|\mathbb{E}\, d_A^t(x,y) - d(x,y)| \le \mathbb{E}\, |d_A^t(x,y) - d_A(x,y)| \le 2|t|$.



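Proof. The first inequality follows from (5.1) and Jensen's inequality. To prove the second inequality, we use the events $\mathcal{E}_i$ and $\mathcal{F}_i$ from Equations (4.1), (4.2) defining the hard and soft Hamming distances, respectively. It follows that

    $\mathbb{E}\, |d_A^t(x,y) - d_A(x,y)| = \mathbb{E}\, \Big| \frac{1}{m} \sum_{i=1}^m (\mathbf{1}_{\mathcal{E}_i} - \mathbf{1}_{\mathcal{F}_i}) \Big|$
        $\le \mathbb{E}\, |\mathbf{1}_{\mathcal{E}_1} - \mathbf{1}_{\mathcal{F}_1}|$    (by triangle inequality and identical distribution)
        $= \mathbb{P}\{\mathcal{E}_1 \,\triangle\, \mathcal{F}_1\}$
        $\le \mathbb{P}\{|\langle a_1, x\rangle| \le |t|\} + \mathbb{P}\{|\langle a_1, y\rangle| \le |t|\}$
        $\le 2\,\mathbb{P}\{|g| \le |t|\}$    (where $g \sim \mathcal{N}(0,1)$)
        $\le 2|t|$    (by the density of the normal distribution). □

Now we upgrade Lemma 5.1 to a concentration inequality: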



Lemma 5.2 (Concentration of distance). Let $A$ be a random Gaussian matrix as in Theorem 4.4. Then, for every $t \in \mathbb{R}$ and every $x, y \in \mathbb{R}^n$, the following deviation inequality holds:

    $\mathbb{P}\{|d_A^t(x,y) - d(x,y)| > 2|t| + \delta\} \le 2\exp(-2\delta^2 m), \quad \delta > 0.$

Proof. By definition, $m \cdot d_A^t(x,y)$ has the binomial distribution $\mathrm{Bin}(m, p)$. The parameter $p = \mathbb{E}\, d_A^t(x,y)$ satisfies by Lemma 5.1 that

    $|p - d(x,y)| \le 2|t|$.

A standard Chernoff bound for binomial random variables states that

    $\mathbb{P}\{|d_A^t(x,y) - p| > \delta\} \le 2\exp(-2\delta^2 m), \quad \delta > 0,$

see e.g. [21 Corollary A.1.7]. The triangle inequality completes the proof. □
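Identity (5.1) and the deviation bound of Lemma 5.2 can be observed in a quick simulation. This is a sketch under stated assumptions: the helper names, the seed, the pair of unit vectors and the tolerance 0.03 are chosen for illustration and do not come from the text.

```python
import math
import random


def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))


def soft_hamming(A, x, y, t):
    """Fraction of rows a_i for which the event F_i of (4.2) occurs."""
    count = 0
    for a in A:
        u, v = dot(a, x), dot(a, y)
        if (u > t and v < -t) or (-u > t and -v < -t):
            count += 1
    return count / len(A)


def geodesic(x, y):
    """Normalized geodesic distance angle(x, y) / pi for unit vectors x, y."""
    c = max(-1.0, min(1.0, dot(x, y)))
    return math.acos(c) / math.pi


rng = random.Random(3)   # fixed seed, illustrative only
n, m = 3, 20000
A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x, y = [1.0, 0.0, 0.0], [0.6, 0.8, 0.0]   # unit vectors

d = geodesic(x, y)
dev_hard = abs(soft_hamming(A, x, y, 0.0) - d)   # t = 0 gives the hard distance a.s.
t = 0.1
dev_soft = abs(soft_hamming(A, x, y, t) - d)     # should be at most about 2|t|
print(d, dev_hard, dev_soft)
```

With $\delta = 0.03$ and $m = 20000$, Lemma 5.2 bounds the failure probability by $2\exp(-36)$, so both deviations are expected to fall well inside $2|t| + \delta$.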
5.2. Concentration of distance over an $\varepsilon$-net. Let us fix a small $\varepsilon > 0$ whose value will be determined later. Let $N_\varepsilon$ be an $\varepsilon$-net of $K$ in the Euclidean metric. By Sudakov's inequality (see [im Theorem 3.18]), we can arrange the cardinality of $N_\varepsilon$ to satisfy

(5.2)    $\log |N_\varepsilon| \le C\varepsilon^{-2} w(K)^2$.

We can decompose every vector $x \in K$ into a center $x_0$ and a tail $x'$ so that

(5.3)    $x = x_0 + x'$, where $x_0 \in N_\varepsilon$, $x' \in (K - K) \cap \varepsilon B_2^n$.

We first control the centers by taking a union bound in Lemma 5.2 over the net $N_\varepsilon$:

Lemma 5.3 (Concentration of distance over a net). Let $A$ be a random Gaussian matrix as in Theorem 4.4. Let $N_\varepsilon$ be a subset of $S^{n-1}$ whose cardinality satisfies (5.2). Let $\delta > 0$, and assume that

(5.4)    $m \ge C\varepsilon^{-2}\delta^{-2} w(K)^2$.

Let $t \in \mathbb{R}$. Then the following holds with probability at least $1 - 2\exp(-\delta^2 m)$:

    $|d_A^t(x_0, y_0) - d(x_0, y_0)| \le 2|t| + \delta, \quad x_0, y_0 \in N_\varepsilon.$
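Proof. By Lemma 5.2 and a union bound over the set of pairs $(x_0, y_0) \in N_\varepsilon \times N_\varepsilon$, we obtain

    $\mathbb{P}\Big\{ \sup_{x, y \in N_\varepsilon} |d_A^t(x,y) - d(x,y)| > 2|t| + \delta \Big\} \le |N_\varepsilon|^2 \cdot 2\exp(-2\delta^2 m) \le 2\exp(-\delta^2 m)$,

where the last inequality follows by (5.2) and (5.4). The proof is complete. □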
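The last step of the union bound is pure arithmetic: $|N_\varepsilon|^2 \cdot 2\exp(-2\delta^2 m) \le 2\exp(-\delta^2 m)$ holds as soon as $2\log|N_\varepsilon| \le \delta^2 m$, which (5.2) and (5.4) guarantee for a suitable constant $C$. A minimal sketch of this bookkeeping follows; the function names and the numerical values of `log_net_size` and `delta` are hypothetical and serve only as an illustration.

```python
import math


def union_bound_log_failure(log_net_size, delta, m):
    """Logarithm of |N|^2 * 2 * exp(-2 * delta^2 * m), the bound over all pairs."""
    return 2 * log_net_size + math.log(2) - 2 * delta ** 2 * m


def min_rows(log_net_size, delta):
    """Smallest integer m with 2 * log|N| <= delta^2 * m."""
    return math.ceil(2 * log_net_size / delta ** 2)


log_net_size, delta = 50.0, 0.1   # hypothetical values for illustration
m = min_rows(log_net_size, delta)
lhs = union_bound_log_failure(log_net_size, delta, m)   # log of the union bound
rhs = math.log(2) - delta ** 2 * m                      # log of 2 * exp(-delta^2 * m)
print(m, lhs, rhs)
```

Working with logarithms avoids floating-point underflow when $\delta^2 m$ is large.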



5.3. Control of the tails. Now we control the tails $x' \in (K - K) \cap \varepsilon B_2^n$ in decomposition (5.3).

Lemma 5.4 (Control of the tails). Consider a subset $K \subseteq S^{n-1}$ and let $\varepsilon > 0$. Let

    $m \ge C\varepsilon^{-2} w(K)^2$.

Consider independent random vectors $a_1, \ldots, a_m \sim \mathcal{N}(0, I_n)$. Then with probability at least $1 - 2\exp(-cm)$, one has

    $\frac{1}{m} \sum_{i=1}^m |\langle a_i, x'\rangle| \le \varepsilon$ for all $x' \in (K - K) \cap \varepsilon B_2^n$.



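Proof. Let us apply Lemma 2.1 for the set $T = (K - K) \cap \varepsilon B_2^n$ instead of $K$, and for $u = \varepsilon/8$. Since $d(T) = \max_{x' \in T} \|x'\|_2 \le \varepsilon$, we obtain that the following holds with probability at least $1 - 2\exp(-cm)$:

    $\sup_{x' \in T} \frac{1}{m} \sum_{i=1}^m |\langle a_i, x'\rangle| \le \sup_{x' \in T} \Big| \frac{1}{m} \sum_{i=1}^m |\langle a_i, x'\rangle| - \sqrt{\frac{2}{\pi}}\, \|x'\|_2 \Big| + \sqrt{\frac{2}{\pi}}\, \varepsilon$

(5.5)        $\le \frac{4 w(T)}{\sqrt{m}} + \frac{\varepsilon}{8} + \sqrt{\frac{2}{\pi}}\, \varepsilon$.

Note that $w(T) \le w(K - K) \le 2w(K)$. So using the assumption on $m$ we conclude that the quantity in (5.5) is bounded by $\varepsilon$, as claimed. □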
5.4. Approximation. Now we establish a way to transfer the distance estimates from an $\varepsilon$-net $N_\varepsilon$ to the full set $K$. This is possible by a continuity property of the soft Hamming distance, which we outlined in Lemma 4.3. This result requires the perturbation to be bounded in $L_\infty$ norm. However, in our situation the perturbations are going to be bounded only in $L_1$ norm due to Lemma 5.4. So we shall prove the following relaxed version of continuity:

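Lemma 5.5 (Continuity with respect to $L_1$ perturbations). Let $x, y, x', y' \in \mathbb{R}^n$, and assume that $\|Ax'\|_1 \le \varepsilon m$, $\|Ay'\|_1 \le \varepsilon m$ for some $\varepsilon > 0$. Then for every $t \in \mathbb{R}$ and $M \ge 1$ one has

(5.6)    $d_A^{t+M\varepsilon}(x,y) - \frac{2}{M} \le d_A^t(x + x', y + y') \le d_A^{t-M\varepsilon}(x,y) + \frac{2}{M}$.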



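Proof. Consider the events $\mathcal{F}_i = \mathcal{F}_i(x, y, t)$ from the definition of the soft Hamming distance (4.2). By the assumptions, we have

    $\sum_{i=1}^m |\langle a_i, x'\rangle| \le \varepsilon m, \quad \sum_{i=1}^m |\langle a_i, y'\rangle| \le \varepsilon m.$

Therefore, the set

    $T := \{i \in [m] : |\langle a_i, x'\rangle| \le M\varepsilon,\ |\langle a_i, y'\rangle| \le M\varepsilon\}$ satisfies $|T^c| \le 2m/M$,

since each of the two sums can have at most $m/M$ terms exceeding $M\varepsilon$. By the triangle inequality, we have

    $\mathcal{F}_i(x, y, t + M\varepsilon) \subseteq \mathcal{F}_i(x + x', y + y', t) \subseteq \mathcal{F}_i(x, y, t - M\varepsilon), \quad i \in T.$

Therefore

    $d_A^{t+M\varepsilon}(x,y) = \frac{1}{m} \sum_{i=1}^m \mathbf{1}_{\mathcal{F}_i(x, y, t + M\varepsilon)} \le \frac{|T^c|}{m} + \frac{1}{m} \sum_{i \in T} \mathbf{1}_{\mathcal{F}_i(x + x', y + y', t)} \le \frac{2}{M} + d_A^t(x + x', y + y')$.

This proves the first inequality in (5.6). The proof of the second inequality is similar. □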
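Like Lemma 4.3, the relaxed continuity property (5.6) is deterministic and can be checked on a concrete matrix. In the sketch below the helper names, sizes, seed and the choice $M = 20$ are assumptions made for illustration; `eps` is computed from the perturbations as the larger of $\|Ax'\|_1/m$ and $\|Ay'\|_1/m$.

```python
import random


def dot(a, x):
    return sum(ai * xi for ai, xi in zip(a, x))


def soft_hamming(A, x, y, t):
    """Fraction of rows a_i for which the event F_i of (4.2) occurs."""
    count = 0
    for a in A:
        u, v = dot(a, x), dot(a, y)
        if (u > t and v < -t) or (-u > t and -v < -t):
            count += 1
    return count / len(A)


rng = random.Random(4)   # fixed seed, illustrative only
n, m = 4, 500
A = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [rng.gauss(0.0, 1.0) for _ in range(n)]
xp = [0.01 * rng.gauss(0.0, 1.0) for _ in range(n)]   # perturbation x'
yp = [0.01 * rng.gauss(0.0, 1.0) for _ in range(n)]   # perturbation y'

# eps * m bounds both ||A x'||_1 and ||A y'||_1.
eps = max(sum(abs(dot(a, xp)) for a in A), sum(abs(dot(a, yp)) for a in A)) / m
M = 20.0
xs = [u + v for u, v in zip(x, xp)]
ys = [u + v for u, v in zip(y, yp)]

checks = []
for t in (-0.3, 0.0, 0.3):
    lower = soft_hamming(A, x, y, t + M * eps) - 2 / M
    middle = soft_hamming(A, xs, ys, t)
    upper = soft_hamming(A, x, y, t - M * eps) + 2 / M
    checks.append(lower <= middle <= upper)
print(eps, checks)
```

The trade-off in $M$ is visible here: a larger $M$ shrinks the additive slack $2/M$ but widens the shift $M\varepsilon$ in the threshold.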



5.5. Proof of Theorem 4.4. Now we are ready to combine all the pieces and prove Theorem 4.4.


To this end, consider the set $K$, numbers $\delta$, $m$, $t$, and the random matrix $A$ as in the theorem. Choose $\varepsilon = (\delta/100)^2$ and $M = 10/\delta$.

Consider an $\varepsilon$-net $N_\varepsilon$ of $K$ as we described in the beginning of Section 5.2. Let us apply Lemma 5.3 that controls the distances on $N_\varepsilon$, along with Lemma 5.4 that controls the tails. By the assumption on $m$ in the theorem and by our choice of $\varepsilon$, both requirements on $m$ in these lemmas hold. By a union bound, with probability at least $1 - 4\exp(-c\delta^2 m)$ the following event holds: for every $x_0, y_0 \in N_\varepsilon$ and $x', y' \in (K - K) \cap \varepsilon B_2^n$, one has

(5.7)    $|d_A^{t-M\varepsilon}(x_0, y_0) - d(x_0, y_0)| \le 2|t - M\varepsilon| + \delta/2$,
         $|d_A^{t+M\varepsilon}(x_0, y_0) - d(x_0, y_0)| \le 2|t + M\varepsilon| + \delta/2$,

(5.8)    $\|Ax'\|_1 \le \varepsilon m, \quad \|Ay'\|_1 \le \varepsilon m.$



Let $x, y \in K$. As we described in (5.3), we can decompose the vectors as

(5.9)    $x = x_0 + x'$, $y = y_0 + y'$, where $x_0, y_0 \in N_\varepsilon$, $x', y' \in (K - K) \cap \varepsilon B_2^n$.
The bounds in (5.8) guarantee that the continuity property (5.6) in Lemma 5.5 holds. This gives

    $d_A^t(x,y) \le d_A^{t-M\varepsilon}(x_0, y_0) + \frac{2}{M} \le d(x_0, y_0) + 2|t| + 2M\varepsilon + \frac{\delta}{2} + \frac{2}{M}$

(by (5.7) and the triangle inequality).



Furthermore, using (5.9) we have

    $|d(x_0, y_0) - d(x,y)| \le d(x_0, x) + d(y_0, y) \le \|x_0 - x\|_2 + \|y_0 - y\|_2 \le 2\varepsilon.$

It follows that

    $d_A^t(x,y) \le d(x,y) + 2|t| + 2M\varepsilon + \frac{\delta}{2} + \frac{2}{M} + 2\varepsilon.$

Finally, by the choice of $\varepsilon$ and $M$ we obtain

    $d_A^t(x,y) \le d(x,y) + 2|t| + \delta.$

A similar argument shows that

    $d_A^t(x,y) \ge d(x,y) - 2|t| - \delta.$

We conclude that

    $|d_A^t(x,y) - d(x,y)| \le \delta + 2|t|.$
This completes the proof of Theorem 4.4. □

6. Proof of Theorem 1.10 on tessellations in $\mathbb{R}^n$


In this section we deduce Theorem 1.10 from Theorem 1.2 by an elementary lifting argument into $\mathbb{R}^{n+1}$. We shall use the following notation: Given a vector $x \in \mathbb{R}^n$ and a number $t \in \mathbb{R}$, the vector $x \oplus t \in \mathbb{R}^n \oplus \mathbb{R} = \mathbb{R}^{n+1}$ is the concatenation of $x \in \mathbb{R}^n$ and $t$. Furthermore, $K \oplus t$ denotes the set of all vectors $x \oplus t$ where $x \in K$.

Assume $K \subset \mathbb{R}^n$ has $\operatorname{diam}(K) = 1$. Translating $K$ if necessary we may assume that $0 \in K$; then

(6.1)    $\frac{1}{2} \le \sup_{x \in K} \|x\|_2 \le 1.$

Also note that by assumption we have

(6.2)    $m \ge C\delta^{-12} w(K - K)^2 \ge C\delta^{-12} w(K)^2.$

Fix a large number $t \ge 2$ whose value will be chosen later and consider the set

    $K' = Q(K \oplus t) \subseteq S^n$

where $Q : \mathbb{R}^{n+1} \to S^n$ denotes the spherical projection map $Q(u) = u/\|u\|_2$. We have

    $w(K') \le t^{-1} w(K \oplus t)$    (as $\|u\|_2 \ge t$ for all $u \in K \oplus t$)
        $\le t^{-1} (w(K) + t\, \mathbb{E}|\gamma|)$    (where $\gamma \sim \mathcal{N}(0,1)$)
        $= t^{-1} w(K) + \sqrt{2/\pi} \le 3w(K)$

where the last inequality holds because $w(K) \ge \sqrt{2/\pi}\, \sup_{x \in K} \|x\|_2 \ge 1/\sqrt{2\pi}$ by (6.1).

Then Theorem 1.2 implies that if $m \ge C\delta_0^{-6} w(K)^2$ for some $\delta_0 > 0$, then there exists an arrangement of $m$ hyperplanes in $\mathbb{R}^{n+1}$ such that

(6.3)    $|d_A(x', y') - d(x', y')| \le \delta_0, \quad x', y' \in K'.$

Consider arbitrary vectors $x$ and $y$ in $K$ and the corresponding vectors $x' = Q(x \oplus t)$ and $y' = Q(y \oplus t)$ in $K'$. Let us relate the distances between $x'$ and $y'$ appearing in (6.3) to corresponding distances between $x$ and $y$.
Let $a_i \oplus a \in \mathbb{R}^{n+1}$ denote normals of the hyperplanes. Clearly, $x'$ and $y'$ are separated by the $i$-th hyperplane if and only if $x \oplus t$ and $y \oplus t$ are. This in turn happens if and only if $x$ and $y$ are separated by the affine hyperplane that consists of all $x \in \mathbb{R}^n$ satisfying $\langle a_i \oplus a, x \oplus t\rangle = \langle a_i, x\rangle + at = 0$. In other words, the hyperplane tessellation of $K'$ induces an affine hyperplane tessellation of $K$, and the fraction $d_A(x', y')$ of the hyperplanes separating $x'$ and $y'$ equals the fraction of the affine hyperplanes separating $x$ and $y$. With a slight abuse of notation, we express this observation as

(6.4)    $d_A(x', y') = d_A(x, y).$

Next we analyze the normalized geodesic distance $d(x', y')$, which satisfies

(6.5)    $|\pi \cdot d(x', y') - \|x' - y'\|_2| \le C_0 \|x' - y'\|_2^2.$

Denoting $t_x = \|x \oplus t\|_2$ and $t_y = \|y \oplus t\|_2$ and using the triangle inequality, we obtain

    $\varepsilon := \big| \|x' - y'\|_2 - t^{-1}\|x - y\|_2 \big| = \big| \|t_x^{-1}(x \oplus t) - t_y^{-1}(y \oplus t)\|_2 - \|t^{-1}x - t^{-1}y\|_2 \big|$

(6.6)        $\le \|x\|\, |t_x^{-1} - t^{-1}| + \|y\|\, |t_y^{-1} - t^{-1}| + t\, |t_x^{-1} - t_y^{-1}|.$
Note that (6.1) yields that $t \le t_x, t_y \le \sqrt{t^2 + 1}$. It follows that $|t_x^{-1} - t^{-1}| \le 0.5\,t^{-3}$ and the same bound holds for the other two similar terms in (6.6). Using this and (6.1) we conclude that $\varepsilon \le t^{-2}$. Putting this into (6.5) and using the triangle inequality twice, we obtain

    $|\pi \cdot d(x', y') - t^{-1}\|x - y\|_2| \le C_0 (t^{-1}\|x - y\|_2 + \varepsilon)^2 + \varepsilon \le C_0 (2t^{-1} + t^{-2})^2 + t^{-2} \le C_1 t^{-2}.$

Finally, we use this bound and (6.4) in (6.3), which gets us

(6.7)    $|\pi t \cdot d_A(x, y) - \|x - y\|_2| \le \pi t \delta_0 + C_1 t^{-1}.$

Now we can assign the values $t := 2C_1/\delta$ and $\delta_0 = \delta^2/(4\pi C_1)$ so the right hand side of (6.7) is bounded by $\delta$, as required. Note that the condition $m \ge C\delta_0^{-6} w(K)^2$ that we used above in order to apply Theorem 1.2 is satisfied by (6.2). This completes the proof of Theorem 1.10. □
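The geometric core of the lifting argument is that, for large $t$, the rescaled geodesic distance $\pi t \cdot d(x', y')$ between the lifted points $x' = Q(x \oplus t)$, $y' = Q(y \oplus t)$ is close to $\|x - y\|_2$. The sketch below checks this numerically for random points of the unit ball. The helper names, the dimension, the value $t = 50$ and the tolerance $2/t$ (standing in for the unspecified constant $C_1 t^{-1}$) are assumptions made for illustration.

```python
import math
import random


def lift(x, t):
    """Q(x (+) t): append the coordinate t and project onto the unit sphere."""
    u = list(x) + [t]
    norm = math.sqrt(sum(v * v for v in u))
    return [v / norm for v in u]


def geodesic(u, v):
    """Normalized geodesic distance angle(u, v) / pi for unit vectors u, v."""
    c = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
    return math.acos(c) / math.pi


def lifted_error(x, y, t):
    """| pi * t * d(x', y') - ||x - y||_2 | for the lifted points x', y'."""
    euclid = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return abs(math.pi * t * geodesic(lift(x, t), lift(y, t)) - euclid)


rng = random.Random(5)   # fixed seed, illustrative only


def random_point_in_ball(n):
    """A random point with Euclidean norm at most 1 (not uniformly distributed)."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    r = rng.random() / math.sqrt(sum(v * v for v in g))
    return [r * v for v in g]


t = 50.0
worst = max(
    lifted_error(random_point_in_ball(3), random_point_in_ball(3), t)
    for _ in range(1000)
)
print(worst, 2 / t)
```

Increasing $t$ flattens the lifted set on the sphere, which reduces the distortion at the cost of a smaller $\delta_0$ in (6.7).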